ApproXQL: Design and Implementation of an Approximate Pattern Matching Language for XML

نویسنده

  • Torsten Schlieder
چکیده

We introduce the simple query language approXQL, which supports hierarchical, Booleanconnected query patterns. The interpretation of approXQL queries is founded on cost-based query transformations: The total cost of a sequence of transformations measures the similarity between a query and the data and is used to rank the results. We describe in detail the implementation of the approXQL query processor, which uses an expanded query representation and sophisticated indexes to compute all results of a query in polynomial – typically sublinear – time with respect to the database size.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Schema-Driven Evaluation of ApproXQL Queries

Query engines for heterogeneous collections of XML data should retrieve exact results – but also answers that are similar to the query. In this paper, we present a simple pattern matching language, which supports hierarchical, Boolean-connected query patterns. The interpretation of a query is founded on cost-based query transformations: The total cost of a sequence of transformations measures t...

متن کامل

Similarity Search in XML Data using Cost-Based Query Transformations

XML query engines should support structured queries. They should retrieve exact matches as well as results similar to the query. In this paper, we introduce the simple query language approXQL that supports hierarchical, Boolean-connected query patterns. The interpretation of approXQL queries is founded on cost-based query transformations: The total cost of a sequence of transformations measures...

متن کامل

Phil: A Lazy Implementation of a Language for Approximate Filtering of XML Documents

In this paper, we introduce a system, written in Haskell, for filtering information from XML data. Essentially, the system implements a simple declarative language which allows one to extract relevant data as well as to exclude useless and misleading contents from an XML document by matching patterns against XML documents. The matching mechanism employes a cost-based pattern transformation algo...

متن کامل

RUN, Xtatic, RUN: EFFICIENT IMPLEMENTATION OF AN OBJECT-ORIENTED LANGUAGE WITH REGULAR PATTERN MATCHING

RUN, Xtatic, RUN: EFFICIENT IMPLEMENTATION OF AN OBJECT-ORIENTED LANGUAGE WITH REGULAR PATTERN MATCHING Michael Y. Levin Benjamin C. Pierce Schema languages such as DTD, XML Schema, and Relax NG have been steadily growing in importance in the XML community. A schema language provides a mechanism for defining the type of XML documents; i.e., the set of constraints that specify the structure of X...

متن کامل

Adaptive Approximate Record Matching

Typographical data entry errors and incomplete documents, produce imperfect records in real world databases. These errors generate distinct records which belong to the same entity. The aim of Approximate Record Matching is to find multiple records which belong to an entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001